Abstract. - The probability distribution $P(k)$ of the sizes $k$ of critical trees (branching ratio
$m = 1$) is well known to show a power-law behavior $k^{-3/2}$. Such behavior corresponds to the
mean-field approximation for many critical and self-organized critical phenomena. Here we
show numerically and analytically that supercritical trees (branching ratio $m > 1$) are also
"critical", in that their size distribution obeys a power law $k^{-2}$. We mention some possible
applications of these results.



Introduction. - Ideal trees, either regular or random, play a great role in the description
of natural phenomena: many systems, from rivers [1] to blood vessels and lungs, to real trees,
can be described by geometrical branched structures. Other phenomena can
be described by branched structures in time (i.e. branching processes), with cascades of
events generating new events in a multiplicative fashion: examples span from physics (e.g.
nuclear chain reactions and directed percolation) to biology (speciation) and many
other disciplines. More recently, trees (and branching processes) have been used to model
the mean-field approximation of many non-equilibrium, self-organized critical systems such
as the Bak-Tang-Wiesenfeld sandpile and the Bak-Sneppen model. The dynamics of
these models has a branching structure, where some local events can trigger more events
of the same kind, with rules given by the branching probability distribution (i.e. the
probability $p_n$ to trigger $n$ new events) and by the geometric constraints induced by the finite
spatial dimension $d$ (that is, different branches can interact when coming to the same region).
To model branching processes in infinite dimension (mean-field) the usual assumption is that
there are no interactions between different branches: the statistical properties of branching,
$p_n$, are preserved, but any spatial constraints are lost. In particular, the mean-field limit of the
above mentioned critical models is recovered by considering critical branching processes [11].
A critical branching process is characterized by an average branching ratio $m = \sum_n n p_n = 1$,
so that every generation of branching is (on average) identical to the preceding ones.
One is usually interested in the probability distribution $P(k)$ of the tree sizes $k$,
that is, the number of sites (or of branching events, in branching process jargon) making
up the trees. It is a well-known result of branching process theory that $P(k) \sim k^{-3/2}$ for
critical trees. This power-law behavior is consistent with the assumption of criticality, as
common wisdom suggests.

© EDP Sciences 



EUROPHYSICS LETTERS 



Interestingly enough, much less is known about the size distribution of supercritical trees.
In this paper we show, through simulations and analytical arguments, that the supercritical
case ($m > 1$) also exhibits a power-law distribution of tree sizes, $P(k) \sim k^{-2}$. Some connections
of this result to the structure of the Internet and to taxonomic systems are proposed.

Critical and Supercritical Trees: Numerical and Analytical Results. - Starting from a root
site (generation 0), a random tree is grown letting every site at generation $t$ ($0 \le t < t_{\max}$)
branch into $n$ new sites (generation $t + 1$) with probability $p_n$. An example of a random tree
is shown in Fig. 1. We are interested in the size distribution of the subtrees: picking a site at
random on the tree, what is the probability $P(k)$ that the subtree that is rooted on it has size
$k$? In Fig. 1 such sizes are also marked for every site. Sites at generation $t_{\max}$ are assigned a
size 1 (just themselves).

We have performed simulations for different choices of $p_n$, both with $m = 1$, finding the
known result $P(k) \sim k^{-3/2}$, and with $m > 1$, finding $P(k) \sim k^{-2}$, as announced above (see
Fig. 2).
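
The growth rule and the subtree-size bookkeeping described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used for the simulations reported here: the function names `subtree_sizes` and `size_distribution` and the example probabilities are hypothetical choices, and the per-tree normalization by the total number of sites is our reading of how the average over realizations is taken.

```python
import random
from collections import Counter

def subtree_sizes(p, t_max, rng):
    """Grow one random tree and return the subtree size rooted at every site.

    p[n] is the probability that a site branches into n new sites.
    Sites at generation t_max do not branch and are assigned size 1.
    The returned list is ordered by generation, so entry 0 is the root.
    """
    offspring = list(range(len(p)))
    # parents[t][i] is the index (within generation t-1) of the parent of
    # site i of generation t; the root has no parent.
    parents = [[None]]
    for t in range(t_max):
        nxt = []
        for idx in range(len(parents[t])):
            n = rng.choices(offspring, weights=p)[0]
            nxt.extend([idx] * n)
        parents.append(nxt)
        if not nxt:          # the tree died out before t_max
            break
    # Every site counts itself; children then add their size to the parent,
    # sweeping from the last generation back to the root.
    sizes = [[1] * len(gen) for gen in parents]
    for t in range(len(parents) - 1, 0, -1):
        for child, parent in enumerate(parents[t]):
            sizes[t - 1][parent] += sizes[t][child]
    return [s for gen in sizes for s in gen]

def size_distribution(p, t_max, n_trees, seed=0):
    """Estimate P(k): fraction of sites rooting a subtree of size k,
    normalized tree by tree and averaged over n_trees realizations."""
    rng = random.Random(seed)
    P = Counter()
    for _ in range(n_trees):
        sizes = subtree_sizes(p, t_max, rng)
        for k, c in Counter(sizes).items():
            P[k] += c / (len(sizes) * n_trees)
    return dict(sorted(P.items()))

if __name__ == "__main__":
    # Hypothetical example: p_0 = 0.2, p_1 = 0.4, p_2 = 0.4, i.e. m = 1.2
    P = size_distribution([0.2, 0.4, 0.4], t_max=30, n_trees=200)
    for k in (1, 2, 4, 8, 16, 32):
        print(k, P.get(k, 0.0))
```

Binning the resulting histogram on intervals growing as powers of 2 and plotting it on a log-log scale is one way to look for the power-law regime at sizes well below the maximal tree size.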

To gain analytical insight into the origin of this behavior, we decompose $P(k)$ into
generation-dependent probabilities $P_t(k)$ ($P_t(k) = 0$ if $t > t_{\max}$, since trees are grown up to
$t_{\max}$). We write explicitly the way in which generation-dependent quantities sum up to give
the tree distribution $P(k)$:

$$P(k) = \left\langle \sum_{t=0}^{t_{\max}} \frac{N_t(k)}{N_{tot}} \right\rangle = \left\langle \sum_{t=0}^{t_{\max}} \frac{N_t}{N_{tot}} \, \frac{N_t(k)}{N_t} \right\rangle \qquad (1)$$

In (1) $N_t(k)$ is the number of sites at generation $t$ that are roots of a subtree of size $k$, $N_t$
is the total number of sites at generation $t$, and $N_{tot}$ is the total number of sites on the tree.
$\langle \cdot \rangle$ indicates the average over many realizations of the system. Indeed, $N_t$, $N_{tot}$ and $N_t(k)$
change from realization to realization. As a first approximation we assume that, over many
realizations, the generation-dependent quantities converge to their average values: $N_t = m^t$,
$N_{tot} = (m^{t_{\max}+1} - 1)/(m - 1)$. Then, with $P_t(k) = \langle N_t(k)/N_t \rangle$, we can write

$$P(k) = (m - 1) \sum_{t=0}^{t_{\max}} \frac{m^t}{m^{t_{\max}+1} - 1} P_t(k) \simeq (m - 1) \sum_{t=0}^{t_{\max}} m^{-(t_{\max}-t+1)} P_t(k) \qquad (2)$$

where the last equality holds in the limit of large $t_{\max}$. We checked the reliability of this
approximation by simulating the same process on regularly growing lattices. Starting at
generation 0 with $N_0$ sites, we set $N_t = m^t N_0$ (approximated to the closest integer). Then each
site at generation $t + 1$ is assigned at random to an ancestor in generation $t$. The branching distribution
is then the binomial

$$p_n = \binom{N_{t+1}}{n} \left( \frac{1}{N_t} \right)^{n} \left( 1 - \frac{1}{N_t} \right)^{N_{t+1}-n} \qquad (3)$$

that in the limit $N_t \gg 1$ becomes the Poisson distribution $p_n = e^{-m} m^n / n!$, independent of
$t$. This process gives the same results as the simulation of the genuine random trees, that is
$P(k) \sim k^{-3/2}$ if $m = 1$, $P(k) \sim k^{-2}$ if $m > 1$, as shown in Fig. 2.

Problem (2) can be formally solved using the recursion relation for the generating functions
of $P_t(k)$. Indeed we can write a relation between the probabilities $P_t(k)$ and $P_{t+1}(k)$:

$$P_t(k) = \sum_n p_n \sum_{k_1, \ldots, k_n} P_{t+1}(k_1) \cdots P_{t+1}(k_n) \, \delta_{k_1 + \ldots + k_n + 1, \, k} \qquad (4)$$

We define the generating functions $G_t(z) = \sum_k P_t(k) z^k$ and $f(z) = \sum_n p_n z^n$. Then,
multiplying both sides of (4) by $z^k$ and summing over $k$, we obtain a recursion relation for
the generating functions at successive generations

$$G_t(z) = z f[G_{t+1}(z)] \qquad (5)$$

that is a very well known result from branching process theory. Eq. (5) can be formally
solved using the "initial condition" $P_{t_{\max}}(k) = \delta_{k,1}$ that translates into $G_{t_{\max}}(z) = z$. Then
we can iterate Eq. (5) back to generation 0 to have an explicit expression of every $G_t(z)$, and,
from (2), obtain the generating function $G(z) = \sum_k P(k) z^k$. Knowing $G(z)$ we could, in
principle, get the asymptotic behavior of $P(k)$: if $P(k) \sim k^{-\tau}$ for large $k$, then $G(z) \sim
1 - c(1 - z)^{\tau-1}$ for $z \to 1$. Unfortunately, if $P(k) \sim k^{-2}$, then we could expect a non-analytic
behavior $G(z) \sim 1 - c(1 - z) \ln(1 - z)$, that cannot be obtained from an expansion of $G(z)$
for $z \to 1$. Instead, we should expand around $z = 0$, and resum the terms power-by-power
of $z$, which entails computing explicitly $P(k)$ directly from (4). Such an observation is of
some relevance since in the critical case $m = 1$, invoking the time-translational invariance of
the process (that is, setting $P_t(k) = P_{t+1}(k) = P(k)$ so that $G_t(z) = G_{t+1}(z) = G(z)$), Eq. (5)
becomes $G(z) = z f[G(z)]$. Expanding both sides for $z \to 1$ and using the corresponding form
$G(z) \sim 1 - c(1 - z)^{\tau-1}$, one can find that to match all the powers of $(1 - z)$ on both sides the
only choice is $\tau = 3/2$. Some simple cases for which $G(z) = z f[G(z)]$ can be solved exactly
are given by the branching probabilities $p_n = 0$ if $n > 2$, with $p_1 + 2p_2 = 1$. In these cases the
recursion equation reduces to a second order equation in $G(z)$, that can be readily solved to
show explicitly that the leading non-analytic behavior for $z \to 1$ is given by $(1 - z)^{1/2}$, hence
$P(k) \sim k^{-3/2}$.
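
As a numerical cross-check of the critical case, the fixed-point equation $G(z) = z f[G(z)]$ can be solved order by order in $z$. The sketch below is an illustration under our own assumptions: the function name `critical_size_distribution` is hypothetical, and the example $p_0 = p_2 = 1/2$, $p_1 = 0$ is just one choice satisfying $p_1 + 2p_2 = 1$. It iterates $G \leftarrow z f(G)$ on truncated power series; each pass fixes one more coefficient.

```python
import numpy as np

def critical_size_distribution(p, k_max):
    """Coefficients P(k), k = 0..k_max, of the series solving G(z) = z f(G(z)).

    p[n] are the branching probabilities, f(z) = sum_n p[n] z**n.
    Starting from G = 0, every pass of G <- z f(G) makes one more
    coefficient exact, so k_max passes fix all orders up to z**k_max.
    """
    G = np.zeros(k_max + 1)
    for _ in range(k_max):
        fG = np.zeros(k_max + 1)
        power = np.zeros(k_max + 1)
        power[0] = 1.0                                   # G**0
        for pn in p:
            fG += pn * power
            power = np.convolve(power, G)[: k_max + 1]   # next power of G
        G = np.concatenate(([0.0], fG[:-1]))             # multiply by z
    return G

if __name__ == "__main__":
    # Hypothetical example with p_1 + 2 p_2 = 1 (so m = 1)
    P = critical_size_distribution([0.5, 0.0, 0.5], 201)
    for k in (1, 3, 5, 51, 101, 201):
        # For this choice only odd sizes occur; k**1.5 * P(k) should level off
        print(k, P[k], P[k] * k ** 1.5)
```

For this particular choice the quadratic equation gives $G(z) = (1 - \sqrt{1 - z^2})/z$, so the printed product $k^{3/2} P(k)$ should approach a constant for odd $k$, in line with the exponent $3/2$.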

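Still, even when the system is supercritical, the generating function formulation is not
fruitless. Indeed it is possible to use it to compute the average subtree size $\langle k \rangle = \sum_k k P(k)$.
Given the generating function $G(z)$ it is easy to see that the average size is

$$\langle k \rangle = \left. z \frac{d}{dz} G(z) \right|_{z=1} = (m - 1) \sum_{t=0}^{t_{\max}} m^{-(t_{\max}-t+1)} G'_t(1) . \qquad (6)$$

Taking the derivative of (5) at $z = 1$, and using $f(1) = 1$ and $f'(1) = m$, we obtain a recursion relation for $G'_t(1)$,

$$G'_t(1) = 1 + m G'_{t+1}(1) . \qquad (7)$$

With the "initial" condition $G'_{t_{\max}}(1) = 1$, we obtain

$$G'_t(1) = \frac{m^{t_{\max}-t+1} - 1}{m - 1} \qquad (8)$$

from which we finally get

$$\langle k \rangle = \sum_{t=0}^{t_{\max}} \left[ 1 - m^{-(t_{\max}-t+1)} \right] \simeq t_{\max} + 1 - \frac{1}{m - 1} . \qquad (9)$$

We find therefore that the average value $\langle k \rangle$ diverges as $t_{\max} \to \infty$, a clear indication
of a power-law behavior. Yet, it diverges as the logarithm of the maximal allowed size, that
is $k_{\max} \sim m^{t_{\max}+1}$, which is the average total number of sites on the tree. This is already a
strong indication that asymptotically $P(k) \sim k^{-2}$.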
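
The recursion (7) and the resulting growth of the average size are easy to check numerically. The following sketch is illustrative only: the function name `mean_subtree_size` is hypothetical, and the weights $N_t/N_{tot}$ are taken with $N_t = m^t$ exactly, that is, before the large-$t_{\max}$ limit.

```python
def mean_subtree_size(m, t_max):
    """Average subtree size from the recursion G'_t(1) = 1 + m G'_{t+1}(1).

    Returns (<k>, g) where g[t] = G'_t(1) is the mean size of a subtree
    rooted at generation t. Assumes m > 1 and the averaged populations
    N_t = m**t, N_tot = (m**(t_max + 1) - 1) / (m - 1).
    """
    g = [0.0] * (t_max + 1)
    g[t_max] = 1.0                        # last generation: size 1
    for t in range(t_max - 1, -1, -1):
        g[t] = 1.0 + m * g[t + 1]
    n_tot = (m ** (t_max + 1) - 1.0) / (m - 1.0)
    mean = sum(m ** t / n_tot * g[t] for t in range(t_max + 1))
    return mean, g

if __name__ == "__main__":
    for t_max in (20, 40, 80):
        mean, _ = mean_subtree_size(1.5, t_max)
        # <k> should grow linearly in t_max, i.e. like log(k_max)
        print(t_max, mean)
```

Doubling $t_{\max}$ roughly doubles $\langle k \rangle$, while the maximal size $m^{t_{\max}+1}$ is squared, which is the logarithmic divergence discussed above.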

We then look at the behavior of $P_t(k)$ from simulations. We plot in Fig. 3 the generation
probabilities $P_t(k)$ at generations $t = 8, 10, 12$ (trees are grown for 40 generations, with $m =
1.2$). They clearly decay exponentially for large values of $k$. Such an exponential decay is
suggestive of the presence of a size $k_t$ characteristic of each generation. There is a simple and
intuitive relation between $P_t(k)$ and $P_{t+1}(k)$, namely,

$$P_{t+1}(k) \simeq m P_t(mk) . \qquad (10)$$

The relation between the arguments is readily understood thinking that, on average, a site
at generation $t$ is the root of a subtree of size $mk$ only if each of its $m$ descendants at generation
$t + 1$ is the root of a subtree of size $k$ (and formally it is expressed by Eq. (4)). The prefactor
$m$ is due to normalization. Interestingly, (10) is the generalization to the supercritical case of
the time-translational invariance assumed to hold in the critical case $m = 1$. Relation (10) is
numerically verified for $t \ll t_{\max}$, where the asymptotic behavior has been reached, as shown
in the inset of Fig. 3.

Using (10) it is at last possible to have further evidence that $P(k) \sim k^{-2}$ asymptotically.
We write

$$P(mk) = (m - 1) \sum_{t=0}^{t_{\max}} m^{-(t_{\max}-t+1)} P_t(mk)$$
$$= (m - 1) \sum_{t=0}^{t_{\max}} m^{-(t_{\max}-t+2)} P_{t+1}(k)$$
$$= (m - 1) \sum_{t=1}^{t_{\max}} m^{-(t_{\max}-t+3)} P_t(k) \sim m^{-2} P(k) \qquad (11)$$

where we have used $P_t(k) = 0$ if $t > t_{\max}$. The final equality is consistent with a power-law
decay with exponent $-2$ for large $k$.

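To give further support to the idea that a functional form of $P_t(k)$ that satisfies (10)
generically implies an inverse-square decay, we assume the form $P_t(k) = (1 - a_t) a_t^{k-1}$ with
$a_t = \exp(-m^{-(t_{\max}-t)})$, that obeys (10). Then

$$P(k) = (m - 1) \left\{ \sum_{t=0}^{t_{\max}} a_t^{k-1} m^{-(t_{\max}-t+1)} - \sum_{t=0}^{t_{\max}} a_t^{k} m^{-(t_{\max}-t+1)} \right\} \qquad (12)$$

The second sum on the r.h.s. of (12) is

$$\sum_{t=0}^{t_{\max}} a_t^{k} m^{-(t_{\max}-t+1)} = \frac{1}{m} \sum_{t=0}^{t_{\max}} e^{-k m^{-(t_{\max}-t)}} m^{-(t_{\max}-t)} = \frac{1}{m} \sum_{t=0}^{t_{\max}} e^{-k m^{-t}} m^{-t} \qquad (13)$$

Going from sums to integrals we write

$$\frac{1}{m} \sum_{t=0}^{t_{\max}} e^{-k m^{-t}} m^{-t} \simeq \frac{1}{m \ln m} \, \frac{1}{k} \int_{e^{-k}}^{1} dy \qquad (14)$$

where we posed $y = e^{-k m^{-t}}$, such that, in the limit of large $t$, $\Delta y \sim m^{-t} \Delta t$ and the
switch from sums to integrals is justified. Moreover, the upper limit of integration, 1, comes
after taking the limit $t_{\max} \to \infty$. After applying the same approximation to the first sum on
the r.h.s. of (12), eventually we have

$$P(k) \simeq \frac{m - 1}{m \ln m} \left[ \frac{1}{k - 1} \int_{e^{-(k-1)}}^{1} dy - \frac{1}{k} \int_{e^{-k}}^{1} dy \right] \sim \frac{1}{k^2} \qquad (15)$$

for large values of $k$.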
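
The sum in (12) can also be evaluated directly, without passing to integrals, to see how quickly the inverse-square regime sets in. The sketch below is a hedged illustration: the function name `ansatz_size_distribution` is hypothetical, the geometric form of $P_t(k)$ is the assumption made in the text, and the prefactor $(m-1)/(m \ln m)$ used for comparison is the one that follows from the integral approximation.

```python
import math

def ansatz_size_distribution(k, m, t_max):
    """P(k) from Eq. (12) with P_t(k) = (1 - a_t) a_t**(k - 1),
    a_t = exp(-m**(-(t_max - t))), summed over generations 0..t_max."""
    total = 0.0
    for t in range(t_max + 1):
        s = t_max - t                      # generations left before t_max
        x = m ** (-s)                      # a_t = exp(-x)
        one_minus_a = -math.expm1(-x)      # 1 - a_t, accurate for small x
        total += one_minus_a * math.exp(-(k - 1) * x) * m ** (-(s + 1))
    return (m - 1.0) * total

if __name__ == "__main__":
    m, t_max = 1.2, 100
    target = (m - 1.0) / (m * math.log(m))
    for k in (10, 50, 200, 1000):
        # k (k - 1) P(k) should stay close to the integral prefactor
        print(k, k * (k - 1) * ansatz_size_distribution(k, m, t_max), target)
```

The comparison is meaningful only for sizes well below the cutoff $m^{t_{\max}}$; closer to it, the finite number of generations is expected to bend the curve away from the power law.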

A derivation of analogous results has previously been given for trees with a fixed branching
ratio. Such a derivation, based on the Mellin transform, although extremely instructive,
cannot be extended directly to variable branching ratios and in particular to a non-zero death
probability.

Conclusions. - We have studied the size distribution $P(k)$ of random trees and subtrees.
Alongside the well known result $P(k) \sim k^{-3/2}$ for the case of critical trees (unitary
branching ratio), we have found that supercritical trees also show a power-law behavior $P(k) \sim
k^{-2}$. Some occurrences of this behavior have already been found.

In taxonomy, it has been found that the distribution of taxa according to the number
of their subtaxa has a power-law behavior, with exponents scattered around 2. Since taxonomy
is intrinsically related to a branched tree organization of genera, families, species etc., it is
suggestive to think that such exponents could be related to the tree-like nature of the system
itself, rather than to some underlying dynamical critical process.

Even more recently it has been pointed out that such a power-law behavior is also present
for trees generated by connections on the Internet: there each node of the tree is
a site on the Internet, and the branches correspond to the sites it is linked to (actually the
structure of the Internet is that of a network, but search and trace programs superimpose on
it a corresponding tree structure). The number of sites on the Internet that collect an area
(number of sites) $k$ scales as $k^{-z}$ with $z$ close to 2. Using this result we can infer that the
Internet, when looked at as a tree, is exponentially growing.

Recently, the inverse-square law has emerged, for similar reasons, in marine ecology [15].

From a more general viewpoint, the emergence of "criticality" in such a simple and well
known framework as supercritical trees is suggestive of a whole new family of systems (webs
and supercritical webs) still hiding interesting properties. Some preliminary simulations for
directed webs show that this seems to be the case.

* * * 
Fig. 1 - A random Cayley tree grown for five generations (from 0 to 4). The last generation sizes are
set to 1, and all the other subtree sizes are also explicitly written.

Fig. 2 - $P(k)$ for $m = 1$, $m = 1.2$ and $m = 1.5$ (for this last case we show the results both on random
trees and on the regular exponentially growing lattice: both exhibit the same behavior); the trees are
grown for 100 ($m = 1$), 50 ($m = 1.2$) and 30 ($m = 1.5$) generations, and averages are taken over 10''
($m = 1, 1.2$) and 10^ ($m = 1.5$) realizations. The data are binned on intervals growing as powers of 2
(the $m = 1.2$ data have also been shifted to fit together into the same graph).

Fig. 3 - Generation distributions $P_t(k)$ for $t = 8, 10, 12$ from upper to lower, respectively, in
correspondence with the arrow. The average branching ratio is $m = 1.2$, and trees are grown for 40
generations. Averages are taken over 10® trees. In the inset the distributions rescaled according to
(10) are shown to nicely collapse.



